(54) Title: METHOD FOR CONTROLLING THE DISTRIBUTION OF DNA SEQUENCING TERMINATION PRODUCTS
(57) Abstract 

A method for sequencing DNA is presented which allows for a more even distribution of short and long strands of synthesized DNA.
This results in a more even peak height allowing the reading of more sequence. The method uses a combination of at least two different 
types of DNA polymerase wherein a first DNA polymerase discriminates highly between dNTPs and ddNTPs and consequently incorporates 
ddNTPs only rarely and wherein a second DNA polymerase does not discriminate to a large extent between dNTPs and ddNTPs and will 
incorporate both nucleotides at approximately equal rates. The method further can incorporate cycle sequencing. The first DNA polymerase 
results in long pieces of DNA which can be further elongated or terminated by the second DNA polymerase in subsequent rounds of 
sequencing thereby allowing for more effective incorporation of label into longer pieces of DNA than is accomplished by most prior art 
methods of sequencing. Use of a proper ratio of the two DNA polymerases and use of a proper ratio of dNTPs to ddNTPs results in a more 
even distribution of label across a range of short to long strands of synthesized DNA than is accomplished by prior art methods thereby 
enabling a better reading of longer stretches of sequence from a single reaction or set of reactions. 
TITLE OF THE INVENTION

METHOD FOR CONTROLLING THE DISTRIBUTION OF DNA SEQUENCING 
TERMINATION PRODUCTS 

BACKGROUND OF THE INVENTION

DNA sequencing has become a very widely used technique since its introduction in 1977.
It is used in many ways. For example it can be used to determine the presence or absence of
mutations known to be associated with disease states thereby allowing a physician to make a
diagnosis or prognosis. It is expected that genotyping will be used in the future by physicians to
determine which drugs to prescribe. Sequencing is used to find new genes and their encoded
proteins and to perform comparisons between various species for evolutionary analysis. 
Sequencing can be used as part of a paternity test or forensic analysis. DNA sequencing is also
the technique being used to determine the sequence of the human genome as well as the
genomes of other organisms such as bacteria, yeast, slime mold, roundworms and fruit flies.

Two different methods of sequencing DNA were introduced in 1977. A paper by Maxam
and Gilbert (1977) introduced a method wherein four sets of chemical reactions are performed
with each reaction preferentially cleaving next to one or two of the four different types of 
nucleotides which occur naturally in DNA. Following these chemical reactions, which are 
carried out only long enough to allow partial cleavage of the many molecules present, the DNA
is electrophoresed on a gel which separates the cleavage products by size. The results can be
analyzed to determine the sequence of nucleotides in the DNA. 

Also in 1977 a paper was published by Sanger et al. (1977). This presented a completely 
different method of sequencing DNA. In this method a DNA template is mixed with dNTPs and 
dideoxy NTPs (ddNTPs) as well as with DNA polymerase and a buffer. Four separate reactions
were performed wherein only a single ddNTP was used in each reaction. The DNA polymerase
would synthesize a new strand of DNA based upon the template which was present. The 
polymerase would randomly choose between, e.g., dATP and ddATP. Whenever a ddNTP was 
inserted, the DNA strand being synthesized could no longer be elongated. Some synthesized 
strands would be terminated quickly because a ddNTP was inserted early in the reaction whereas
other strands of newly synthesized DNA would be longer because by chance no ddNTP was
incorporated for a long time. Because many strands of DNA were being synthesized, every size 
of DNA was being formed with every one ending in a ddNTP. Usually one of the dNTPs or the 
ddNTP was radioactively labeled. After performing the four separate reactions, the products
were electrophoresed side by side on a polyacrylamide gel and then an autoradiogram of the gel 
was made. The resulting bands allowed one to easily determine the sequence of the template 
DNA. 

The Sanger method is the more widely used of the above two methods. Originally the 
method allowed for the determination of approximately 200 base pairs of sequence from a single 
set of reactions. Over time many improvements have been made and now sequences of over 
1000 bases can be read from a single sequencing reaction. The buffer system has been improved 
and a wide variety of different DNA polymerases from different species have been introduced. 
These allowed the determination of many more bases from a single reaction. Also, modified 
dNTPs are sometimes used to read through unusual stretches of DNA, such as long runs of G. 
Another major improvement was the introduction of fluorescent labels which allowed the final 
reactions to all be run on a single lane of a gel rather than side by side in four separate lanes. 
This is possible because four differently colored fluorescent labels are used and can be 
distinguished from each other. This has further allowed automation to become very widely used. 
The introduction of automated DNA synthesizers was a further important advance because it 
allowed the easy manufacture of primers which can be used to walk along a long piece of DNA 
which is too long to be sequenced in a single reaction. The introduction of cycle sequencing 
further increased the usefulness and sensitivity of DNA sequencing. 

Despite the many improvements made to the Sanger method of sequencing since it was 
introduced over 20 years ago, there is still room for and a desire for yet more improved 
techniques. The instant invention addresses one of the problems which still exists with the 
Sanger method of sequencing. It has still been a problem to achieve high signal intensity a long 
distance from the primer. Most DNA polymerases do not use ddNTPs very efficiently as 
compared to dNTPs whereas some polymerases such as AmpliTaq® DNA Polymerase, FS 
(Perkin-Elmer) have been engineered to use dNTPs and ddNTPs relatively equally. Wild-type 
bacteriophage T7 DNA polymerase efficiently incorporates both dNTPs and ddNTPs whereas 
wild-type E. coli DNA polymerase and wild-type Thermus aquaticus DNA polymerase (Taq)
discriminate greatly against ddNTPs. This difference in discrimination has been attributed to a 
single amino acid residue of the polymerases. Replacing Tyr-526 of T7 DNA polymerase with 
phenylalanine increases the discrimination against the four ddNTPs by 2000-fold while replacing 
the phenylalanine at the homologous position in the E. coli DNA polymerase I or T. aquaticus
DNA polymerase with tyrosine decreases discrimination against the four ddNTPs by 250-8000 
fold (Tabor and Richardson, 1995). Brandis et al. (1996) report that Taq DNA polymerase is 
biased in favor of dATP over ddATP by about 700 to 1 whereas T7 DNA polymerase showed 
a preference of 4 to 1.

The problem which is addressed by this disclosure is how to obtain both short and long 
fragments of DNA from the sequencing reactions. If the ratio of ddNTPs to dNTPs is too high 
most synthesized strands will quickly be terminated and it will not be possible to read very much 
sequence. In practice it is common that the shorter strands are produced in greater number and 
result in much higher signal as compared to the longer strands synthesized in the sequencing 
reactions. This is especially true when the label is incorporated in the primer or in the ddNTP 
thereby causing each strand to be labeled to an equal degree regardless of the length of the strand. 
Consequently, the longer strands give a much weaker signal making it difficult to read the 
sequence out as far as desired. This invention overcomes this problem by using a combination 
of DNA polymerases wherein a first polymerase incorporates dNTPs at a much higher efficiency 
than it does ddNTPs and the second polymerase incorporates dNTPs and ddNTPs with 
approximately equal efficiency or incorporates ddNTPs at a higher efficiency than dNTPs. The 
combination of DNA polymerases, used in either a single-temperature or a cycle sequencing method,
allows for a more even synthesis of all sizes of terminated DNA products over a long range of 
sizes. The result is that one can more accurately determine long stretches of sequence in a single 
reaction as compared to the prior art methods. 

SUMMARY OF THE INVENTION 

The present invention is a modification of the Sanger method of DNA sequencing. The 
modification allows the determination of long stretches of DNA sequence with a single reaction. 
This is accomplished by using a combination of DNA polymerases rather than a single DNA 
polymerase as has been done in the prior art. The combination of DNA polymerases allows for 
the synthesis of more equal amounts of all sizes of products from short to long than occurs with 
the prior art methods. Because relatively more of the longer sized products can be synthesized 
than occurred using the prior art methods, it is possible to more accurately determine longer 
amounts of sequence via a single reaction than could previously be accomplished. 
DEFINITIONS

"Different DNA polymerases" means that the polymerases discriminate differently from 
each other as to their relative ability to incorporate dNTPs vs. ddNTPs into DNA. 

"To incorporate more efficiently" means that the polymerase incorporates the favored 
nucleotide at higher efficiency at a specific site than the disfavored nucleotide. For example, if
it is stated that the DNA polymerase incorporates dNTPs into DNA at least 20-fold more 
efficiently than it incorporates ddNTPs into DNA, this means that the polymerase is 20 times 
more likely to bind and insert a dNTP than the ddNTP at a specific site under the reaction 
conditions being used. 

"Relatively equal efficiency" or "similar efficiency" as applied to the relative ability of
a DNA polymerase to insert a dNTP vs. its ability to insert a ddNTP means that the polymerase
will insert a dNTP no more than 10-fold more efficiently than it inserts a ddNTP and that the 
polymerase will insert a ddNTP no more than 10-fold more efficiently than it inserts a dNTP. 
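
The definitions above amount to thresholds on a fold-preference ratio. The following is a minimal Python sketch, assuming that a polymerase's preference can be summarized as a single dNTP:ddNTP fold value. The function name and the "favors dNTPs" and "favors ddNTPs" labels are illustrative only and do not appear in the specification; only the 10-fold boundary comes from the definition of "relatively equal efficiency".

```python
def classify_discrimination(dntp_over_ddntp: float) -> str:
    """Classify a polymerase by its fold preference for dNTPs over ddNTPs.

    Values above 1 mean the enzyme prefers dNTPs; values below 1 mean it
    prefers ddNTPs. The 10-fold boundary follows the definition of
    "relatively equal efficiency"; the other labels are hypothetical.
    """
    if dntp_over_ddntp <= 0:
        raise ValueError("fold preference must be positive")
    if dntp_over_ddntp > 10:
        return "favors dNTPs"
    if dntp_over_ddntp < 1 / 10:
        return "favors ddNTPs"
    return "relatively equal efficiency"


# Fold preferences for dATP over ddATP reported by Brandis et al. (1996)
print(classify_discrimination(700))  # Taq DNA polymerase
print(classify_discrimination(4))    # T7 DNA polymerase
```

Under these thresholds the reported 700 to 1 bias of Taq falls in the first polymerase type and the 4 to 1 bias of T7 falls within "relatively equal efficiency".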

DETAILED DESCRIPTION OF THE INVENTION

Sanger dideoxynucleotide termination chemistry (Sanger method) is the most widely used 
method to determine the sequence of bases along a length of DNA. The components in the 
Sanger method include: dideoxynucleotide, deoxynucleotides, DNA polymerase, reaction buffer, 
primer and template. DNA polymerase extends the length of DNA from a primer when given
the appropriate template, buffer and deoxynucleotides. The basic mechanism by which the
Sanger method works is the termination of elongation at a specific position by the incorporation 
of a dideoxynucleotide in place of a deoxynucleotide by the DNA polymerase. Typically, only 
a single type of dideoxynucleotide is used in a reaction, so four different reactions are required 
to determine the DNA sequence. However, if the four different dideoxynucleotides are
fluorescently labeled with different fluors, then the reaction may be performed in a single tube.
The ratio of dideoxynucleotide to deoxynucleotide determines the probability that termination 
will occur at a particular base. This probability is also dependent upon the efficiency with which 
the DNA polymerase is able to incorporate the dideoxynucleotide. Most DNA polymerases 
cannot incorporate dideoxynucleotides efficiently. Thermostable DNA polymerases such as
native Taq DNA polymerase inherently do not use dideoxynucleotides efficiently. Therefore to
use native Taq DNA polymerase in a DNA sequencing reaction the concentration of
dideoxynucleotide is increased to an empirically determined level. DNA polymerases such as
AmpliTaq® DNA Polymerase, FS have been engineered so that they discriminate very little 
between dideoxynucleotides and deoxynucleotides. 
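
As a rough illustration of how the nucleotide ratio and the polymerase's discrimination combine, the sketch below assumes a simple competition model in which the chance of termination at a site is r/(r + D), where r is the ddNTP:dNTP concentration ratio and D is the polymerase's fold preference for dNTPs. The model, the function name and the example ratio of 0.01 are assumptions made for illustration and are not taken from the specification; the discrimination values of 700 and 4 are those reported by Brandis et al. (1996) for dATP versus ddATP.

```python
def termination_probability(ddntp_to_dntp: float, discrimination: float) -> float:
    """Per-site termination probability under a simple competition model.

    ddntp_to_dntp  -- concentration ratio [ddNTP]/[dNTP] (assumed input)
    discrimination -- fold preference of the polymerase for dNTP over ddNTP
    """
    if ddntp_to_dntp < 0 or discrimination <= 0:
        raise ValueError("ratio must be >= 0 and discrimination must be > 0")
    # Relative rate of ddNTP insertion compared with dNTP insertion
    relative_rate = ddntp_to_dntp / discrimination
    return relative_rate / (1.0 + relative_rate)


ratio = 0.01  # hypothetical ddNTP:dNTP ratio, chosen only for illustration
for name, fold in [("Taq (700:1)", 700.0), ("T7 (4:1)", 4.0)]:
    p = termination_probability(ratio, fold)
    print(f"{name}: termination probability per site = {p:.6f}")
```

In this model a strongly discriminating polymerase needs a proportionally higher ddNTP concentration to reach the same termination probability, which is consistent with the empirical increase described for native Taq.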

DNA elongation termination is controlled by the relative effective concentration of
dideoxynucleotides to deoxynucleotides and the processivity of the DNA polymerase. The
probability that the DNA polymerase will incorporate a dideoxynucleotide vs. a deoxynucleotide 
is dependent upon their relative effective concentration. This probability starts at the base 
immediately adjacent to the 3' end of the primer and follows an exponential decay. Therefore 
most of the signal (i.e., termination) will be near the beginning of the sequencing primer. This
makes it difficult to obtain sequence information farther from the primer. Being able to read
more sequence from a single reaction is highly desirable. Yet, without any special treatment of 
the sequencing reaction, the farther away that the reaction gets from the sequencing primer the 
higher the probability that a termination event will have already occurred. Effectively, the signal 
intensity for sequences far away from the sequencing primer is low. 
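
The decay described above can be made concrete with a short calculation. The sketch below assumes a constant per-site termination probability p, so that the fraction of strands ending exactly n bases from the primer is p(1 - p)^(n - 1), the discrete form of an exponential decay. The value p = 0.005 and the function name are hypothetical and serve only to illustrate the shape of the distribution.

```python
def fraction_terminated_at(n: int, p: float) -> float:
    """Fraction of strands that terminate exactly n bases past the primer.

    Assumes the same termination probability p at every site, which is a
    simplification: real reactions terminate only at the matching base.
    """
    if n < 1 or not 0.0 < p < 1.0:
        raise ValueError("n must be >= 1 and p must lie strictly between 0 and 1")
    return p * (1.0 - p) ** (n - 1)


p = 0.005  # hypothetical per-site termination probability
near = fraction_terminated_at(50, p)
for n in (50, 250, 500, 1000):
    frac = fraction_terminated_at(n, p)
    print(f"{n:5d} bases: relative signal {frac / near:.3f}")
```

With equal labeling per strand, the printed ratios approximate how much weaker the signal becomes at each distance relative to the signal 50 bases from the primer.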

The Sequenase kit from U.S. Biochemical (USB) solved this problem in the late 1980s
by adding an elongation step prior to the termination step. The sequencing primer is extended
for a short period of time in a first step which includes dNTPs but does not include ddNTPs. 
This step produces elongated DNA strands from the sequencing primer. The 3' end of each of
these elongated strands is a deoxynucleotide and therefore all of these elongated strands are still
templates for further elongation. The length of elongation is dependent upon the processivity of
the DNA polymerase and this processivity may be modified by varying the ionic strength. After 
the elongation step a second step which is a termination reaction is performed. 
Dideoxynucleotides are used in this second step and they cause sequence defined termination to 
occur. However, because the Sequenase kit uses a two step process it cannot be used in a high
throughput environment. This is because it requires a second addition of reagents.

The disclosed method takes advantage of the differential ability of DNA polymerases to 
incorporate dideoxynucleotides to produce a system where the distribution of DNA sequencing 
products can be controlled. This new sequencing system uses two different DNA polymerases. 
A first DNA polymerase is one that cannot efficiently incorporate dideoxynucleotides and/or may
be incapable of incorporating ddNTPs. A second DNA polymerase is able to readily incorporate
dideoxynucleotides and may incorporate dNTPs and ddNTPs relatively equally or it may favor
ddNTPs. Both of these DNA polymerases may also be thermostable, and may be used in
cycle-sequencing reactions.

The first DNA polymerase may not incorporate dideoxynucleotides efficiently, so for this 
first DNA polymerase the effective concentration of dideoxynucleotides is low and this DNA 
polymerase acts as if it only extends the sequencing primer. The 3' ends of these extended 
products most of the time will contain a deoxynucleotide and therefore they are still a substrate 
for DNA polymerase extension. The second DNA polymerase is able to incorporate 
dideoxynucleotides and deoxynucleotides with more similar efficiency than the first DNA 
polymerase or alternatively it may favor ddNTPs either slightly or up to a large extent even to the 
exclusion of incorporating dNTPs. So the second DNA polymerase will generate termination 
products. Both DNA polymerases can be thermostable so that the extension-termination steps 
may be repeated several times. The thermocycle profile includes denaturation of the template
at an elevated temperature, annealing of the sequencing primer and extension-termination. This 
method will distribute the signal intensity more evenly throughout the sequence and will allow 
one to move and control the distribution of the signal intensity (e.g., move the distribution farther 
away from the sequencing primer). The distribution of the signal intensity is dependent upon the 
ability of the two DNA polymerases to use the dideoxynucleotide and their processivity (number 
of nucleotides added per enzyme-substrate (primer:template) binding event). Control of the
signal intensity along the DNA sequence is effected by manipulating the concentration of the first 
DNA polymerase relative to the second DNA polymerase, the concentration of the DNA 
polymerases in the sequencing reaction and the relative concentration of the dideoxynucleotides 
to the deoxynucleotides. 
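
The interplay of the control parameters named above can be explored with a toy Monte Carlo model. The sketch below is an illustration under stated assumptions and is not the analysis used in this disclosure: each binding event picks the second polymerase with a probability equal to its assumed share of the enzyme, the bound polymerase adds up to a fixed number of bases (its processivity), and each added base terminates the strand with that polymerase's per-site termination probability. All parameter values and names are hypothetical.

```python
import random
from statistics import mean


def simulate_lengths(n_strands, frac_second, p_term_first, p_term_second,
                     processivity, max_len, seed=0):
    """Return the lengths of terminated strands in a toy two-polymerase model.

    Strands that reach max_len (the end of the template) without
    incorporating a ddNTP are treated as run-off products and omitted.
    """
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_strands):
        length = 0
        terminated = False
        while not terminated and length < max_len:
            # Choose which polymerase binds for this extension event
            use_second = rng.random() < frac_second
            p_term = p_term_second if use_second else p_term_first
            for _ in range(processivity):
                length += 1
                if rng.random() < p_term:
                    terminated = True
                    break
                if length >= max_len:
                    break
        if terminated:
            lengths.append(length)
    return lengths


# Hypothetical settings: second polymerase alone versus a 1:4 mixture
alone = simulate_lengths(500, 1.0, 0.0001, 0.01, 50, 1000, seed=1)
mixed = simulate_lengths(500, 0.2, 0.0001, 0.01, 50, 1000, seed=1)
print(f"second polymerase alone: mean terminated length {mean(alone):.0f}")
print(f"mixture of both:         mean terminated length {mean(mixed):.0f}")
```

In this simplified model, lowering the share of the second polymerase moves the terminated products farther from the primer. The model does not capture thermocycling, sequence-specific termination or enzyme kinetics.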

In another aspect of the invention, the sequencing reactions with two different 
polymerases can be performed in separate tubes and then combined before gel analysis. 

The present invention is further detailed in the following Example, which is offered by 
way of illustration and is not intended to limit the invention in any manner. Standard techniques 
well known in the art or the techniques specifically described below are utilized. 
EXAMPLE

Protocol for Obtaining More Even Signal Intensities Along Sequencing Reaction

Prepare a 5X PCR buffer of 50% sucrose, 50 mM Tris pH 8.1, 250 mM KCl, 0.05%
Tween 20, 5 mM EDTA, 3 mM MgCl2. Prepare four separate deoxynucleotides/
dideoxynucleotide (d/ddNTP) mixes. Each mix is 7.2 mM of each of four deoxynucleotides 
(dATP, dGTP, dCTP, dTTP) and 56 nM of a single dideoxynucleotide. 

Mix the following reagents in four tubes labeled with AF, CF, GF, TF for the four 
termination reactions (sufficient for eight reactions). Native Taq DNA polymerase, Stoffel 
fragment or any DNA polymerase of the first type (those which favor dNTPs over ddNTPs) may 
be substituted for Platinum Taq. Thermosequenase or any DNA polymerase of the second type 
(those which use dNTPs and ddNTPs relatively equally or which favor ddNTPs) may be 
substituted for AmpliTaq® DNA Polymerase, FS. 



Reagents                      AF               CF               GF              TF

5X PCR buffer                 16 uL            16 uL            32 uL           32 uL
Water                         15 uL            15 uL            30 uL           30 uL
AmpliTaq®, FS (12 Units/uL)   0.5 uL           0.5 uL           1 uL            1 uL
Platinum Taq                  0.16 unit        0.16 unit        0.16 unit       0.16 unit
FET primers* (0.4 uM)         8 uL AF          8 uL CF          8 uL GF         8 uL TF
d/ddNTP mix                   d/ddATP 0.5 uL   d/ddCTP 0.5 uL   d/ddGTP 1 uL    d/ddTTP 1 uL



*FET stands for fluorescence energy transfer. These are sold by Amersham.
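
The volumes in the table can be checked with a few lines of arithmetic. The sketch below is an illustrative Python calculation that totals the liquid components of each mix and works out the buffer strength before and after an equal volume of template is added. It assumes that the Platinum Taq contribution is negligible in volume, since the table gives it in units only; the dictionary layout and function names are hypothetical.

```python
# Volumes in uL, taken from the reagent table (Platinum Taq omitted: units only)
MIXES = {
    "AF": {"5X PCR buffer": 16.0, "water": 15.0, "AmpliTaq FS": 0.5,
           "FET primer": 8.0, "d/ddNTP mix": 0.5},
    "CF": {"5X PCR buffer": 16.0, "water": 15.0, "AmpliTaq FS": 0.5,
           "FET primer": 8.0, "d/ddNTP mix": 0.5},
    "GF": {"5X PCR buffer": 32.0, "water": 30.0, "AmpliTaq FS": 1.0,
           "FET primer": 8.0, "d/ddNTP mix": 1.0},
    "TF": {"5X PCR buffer": 32.0, "water": 30.0, "AmpliTaq FS": 1.0,
           "FET primer": 8.0, "d/ddNTP mix": 1.0},
}


def total_volume(mix):
    """Total volume of a mix in uL."""
    return sum(mix.values())


def buffer_strength(mix, stock_x=5.0):
    """Buffer concentration in the mix, as a multiple of 1X."""
    return stock_x * mix["5X PCR buffer"] / total_volume(mix)


for name, mix in MIXES.items():
    in_mix = buffer_strength(mix)
    # The protocol adds a volume of template equal to the dispensed mix
    in_well = in_mix / 2.0
    print(f"{name}: {total_volume(mix):.1f} uL total, "
          f"buffer {in_mix:.2f}X in mix, {in_well:.2f}X in the well")
```

On these figures the AF and CF mixes total 40.0 uL and reach 1.00X buffer in the well, while the GF and TF mixes total 72.0 uL and reach about 1.11X.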

Dispense 2 uL of the AF and CF and 4 uL of the GF and TF mixes into individual wells
of a 384 well microtiter plate. Add 2 uL of the sequencing template (pGEM 3ZF(+) at 0.2
ug/uL) into the AF and CF mixes and 4 uL of the same template into the GF and TF mixes.
Overlay 4 uL of silicone oil into each well and centrifuge to bring the reagents to the bottom of
the well. Cover the plate with a silicone gasket and thermocycle. The thermocycle profile is 
94 °C for 20 minutes; 56 °C for 30 minutes; 72 °C for 1 minute for 32 cycles, hold at 72 °C for 1 
minute, then keep at 4°C. 

After the thermocycling, consolidate the AF, CF, GF and TF reaction for each template. 
Add 0.1 volume of 7 M NH4OAc and 2.5 volumes of 95% ethanol. Mix well and pellet the
precipitated sequencing reactions by centrifugation. Wash the pelleted sequencing product with
70% ethanol and allow to dry. The sequencing product can then be resuspended in formamide 
and loaded onto an automated fluorescent sequencer. 
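
The precipitation volumes follow directly from the pooled reaction volume. The sketch below is a small worked calculation in Python under two assumptions that the protocol does not spell out: the silicone oil overlay is not counted, and both "volumes" are taken relative to the pooled reaction volume before any addition. The function name is hypothetical.

```python
def precipitation_volumes(sample_ul, salt_volumes=0.1, ethanol_volumes=2.5):
    """Volumes (uL) of 7 M NH4OAc and 95% ethanol to add to a pooled sample."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return {
        "7 M NH4OAc": sample_ul * salt_volumes,
        "95% ethanol": sample_ul * ethanol_volumes,
    }


# Pooled volume for one template: AF and CF wells hold 2 uL mix + 2 uL template,
# GF and TF wells hold 4 uL mix + 4 uL template.
pooled_ul = 2 * (2 + 2) + 2 * (4 + 4)
for reagent, volume in precipitation_volumes(pooled_ul).items():
    print(f"{reagent}: {volume:.1f} uL")
```

For the 24 uL pooled from one template this gives 2.4 uL of ammonium acetate and 60.0 uL of ethanol.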

While the invention has been disclosed in this patent application by reference to the 
details of preferred embodiments of the invention, it is to be understood that the disclosure is 
intended in an illustrative rather than in a limiting sense, as it is contemplated that modifications 
will readily occur to those skilled in the art, within the spirit of the invention and the scope of the 
appended claims. 



WO 00/42223 PCT/US00/00849 


LIST OF REFERENCES 
Brandis JW, et al. (1996). Biochemistry 35:2189-2200.
Maxam AM and Gilbert W (1977). Proc. Natl. Acad. Sci. USA 74:560-564. 
Sanger F, Nicklen S and Coulson AR (1977). Proc. Natl. Acad. Sci. USA 74:5463-5467.
Tabor S and Richardson CC (1995). Proc. Natl. Acad. Sci. USA 92:6339-6343. 




WHAT IS CLAIMED IS: 

1. A method for sequencing DNA wherein two or more different DNA polymerases are
utilized comprising a first DNA polymerase and a second DNA polymerase. 



2. The method of claim 1 wherein said first DNA polymerase incorporates dNTPs into DNA 
at least 20-fold more efficiently than it incorporates ddNTPs into DNA. 

3. The method of claim 1 wherein said first DNA polymerase incorporates dNTPs into DNA 
at least 100-fold more efficiently than it incorporates ddNTPs into DNA.

4. The method of claim 1 wherein said first DNA polymerase incorporates dNTPs into DNA 
at least 8000-fold more efficiently than it incorporates ddNTPs into DNA. 

5. The method of claim 1 wherein said second DNA polymerase incorporates dNTPs and 
ddNTPs into DNA with similar efficiency. 

6. The method of claim 1 wherein said second DNA polymerase incorporates ddNTPs into 
DNA at least 20-fold more efficiently than it incorporates dNTPs into DNA. 

7. The method of claim 1 wherein said second DNA polymerase incorporates ddNTPs into 
DNA at least 100-fold more efficiently than it incorporates dNTPs into DNA. 

8. The method of claim 1 wherein said second DNA polymerase incorporates ddNTPs into 
DNA at least 8000-fold more efficiently than it incorporates dNTPs into DNA. 

9. The method of claim 1 wherein said sequencing is performed at a constant temperature. 

10. The method of claim 9 wherein said sequencing comprises a step of DNA chain 
termination wherein said step is a maximum of 30 minutes. 



11. The method of claim 10 wherein said step is a maximum of 20 minutes.

12. The method of claim 10 wherein said step is a maximum of 10 minutes.



13. The method of claim 1 wherein said sequencing comprises cycle sequencing. 

14. The method of claim 1 wherein said DNA polymerases are heat stable to at least 94°C. 

15. The method of claim 1 wherein said sequencing is performed in a single container in the 
presence of both said first DNA polymerase and said second DNA polymerase. 

16. The method of claim 1 wherein said sequencing is performed in two containers wherein 
a first sequencing reaction is performed in a first container in the presence of said first 
DNA polymerase to produce a first set of products and a second sequencing reaction is 
performed in a second container in the presence of said second DNA polymerase to 
produce a second set of products, following which the first set of products and the second 
set of products are mixed with each other and then analyzed. 

17. The method of claim 1 wherein primers comprising a detectable label are used. 

18. The method of claim 17 wherein said detectable label is a dye or radioactivity.

19. The method of claim 1 wherein ddNTPs comprising a detectable label are used. 

20. The method of claim 19 wherein said detectable label is a dye or radioactivity. 

21. The method of claim 1 wherein the processivity of said first DNA polymerase equals the
processivity of said second DNA polymerase. 

22. The method of claim 1 wherein the processivity of said first DNA polymerase differs 
from the processivity of said second DNA polymerase. 

23. A kit for sequencing nucleic acid comprising a first DNA polymerase and a second DNA
polymerase wherein said first DNA polymerase is different from said second DNA
polymerase.

24. The kit of claim 23 further comprising a buffer.

25. The kit of claim 23 further comprising dNTPs and ddNTPs.

26. The kit of claim 23 wherein said first DNA polymerase and said second DNA polymerase
are active at temperatures higher than 65 °C.